Allow empty package paths on startup#1482
Conversation
|
| Status | Scanner | Total (0) | ||||
|---|---|---|---|---|---|---|
| Open Source Security | 0 | 0 | 0 | 0 | See details | |
| Licenses | 0 | 0 | 0 | 0 | See details |
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.
|
As i test the solution, i see the config.PackagePaths is not being validated, meaning i can use an empty config file so the filesystem indexers dont have any reference to watch. I guess before this change the "validation" was done because there was an empty list of packages and the init indexer failed. I think i should include here a validation for the config PackagePaths item, if the we allow empty these should not be empty, if we dont allow empty it will fail as it has been doing as the program wont find packages at an empty dir. WDYT? @mrodm @jsoriano |
Yeah, sounds good to check at least that these directories exist. |
Sometimes, at least locally I set this configuration to test just enabling the storage indexers ( package_paths: []Would that be valid in that configuration? Another issue, in the current configuration: Currently, the folder set in the configuration does not exist in the final docker image (e.g. It would be needed to ensure that this folder exists in the docker image (update the Dockerfile), or ensure to not fail those EPR services that just enable the storage indexer and therefore that folder does not exist in the container. |
Yes, this should be allowed if there are other indexers enabled. For the matter of simplification we can allow this always.
Makes sense. |
|
i've updated the PR with the use of fsnotify, i did not thought of it. I've done an implementation which virtually works but while testing i get some weird outputs.
i think it has something to do on having 2 watchers looking the same folder, probably a race condition... but i am not able to track it. Also, the mutex should be locking the read and write operations, but maybe when the write is done from one indexer, the read is locked from the other? 🤷🏻♀️ @mrodm @jsoriano could we look into this in a pair session? or if you could take a look by running the code yourself i think there are some details i am missing out. |
i've updated the check so it checks only when the flag to allow empty is enabled. in this case we need at least a directory to watch for changes |
…kageFileSystem method
…lays for file handle release and ensuring file sync before closing in createMockZipPackage function.
…dding delays for file handle release and ensuring file sync before closing in createMockZipPackage function." This reverts commit b1f30cc.
…source management
…cleanup after tests
| err = archiver.ArchivePackage(file, archiver.PackageProperties{ | ||
| Name: pkgName, | ||
| Version: "1.0.0", | ||
| Path: filepath.Join(tmpMockFSDir, pkgName, "1.0.0"), | ||
| }) | ||
| require.NoError(t, err) |
There was a problem hiding this comment.
The problem with tests in Windows might come from this function. It is creating the headers with native paths:
header, err := buildArchiveHeader(info, filepath.Join(rootDir, relativePath))
if err != nil {
return fmt.Errorf("building archive header failed (path: %s): %w", relativePath, err)
}
...
func buildArchiveHeader(info os.FileInfo, relativePath string) (*zip.FileHeader, error) {
header, err := zip.FileInfoHeader(info)
if err != nil {
return nil, fmt.Errorf("reading file info header failed (info: %s): %w", info.Name(), err)
}
header.Method = zip.Deflate
header.Name = relativePath
if info.IsDir() && !strings.HasSuffix(header.Name, "/") {
header.Name = header.Name + "/"
}
return header, nil
}But the Name in these headers should use forward slashes https://pkg.go.dev/archive/zip#FileHeader
You can try with:
header.Name = filepath.ToSlash(relativePath)There was a problem hiding this comment.
thanks jaime, this worked. when i tried this change locally there was other issue going on and i discarded it 🤦🏻♀️
… replace require with assert for length checks
mrodm
left a comment
There was a problem hiding this comment.
LGTM
Just a question about the transactions when the Go routine is started.
| i.logger.Debug(fmt.Sprintf("No watcher configured for %s indexer, packageList will not be updated", i.label)) | ||
| return nil | ||
| func (i *FileSystemIndexer) watchPackageFileSystem(ctx context.Context) { | ||
| // TODO: https://github.com/elastic/package-registry/issues/1488 |
packages/packages.go
Outdated
| } | ||
|
|
||
| if i.enablePathsWatcher { | ||
| go i.watchPackageFileSystem(ctx) |
There was a problem hiding this comment.
Should it be set the transaction as nil as in the other indexers ?
| go i.watchPackageFileSystem(ctx) | |
| go i.watchPackageFileSystem(apm.ContextWithTransaction(ctx, nil)) |
There was a problem hiding this comment.
looking into the call hierarchy i see there is a transaction initialized at initIndexer that is passed along the ctx object... so short answer is yes.
i've asked Copilot about it and it proposed me to use apm.DetachedContext(ctx) instead, giving me the following explanation, but from my pov this is not what we want, as we want to remove previous trace reference and init a new one inside the go routine. Also, the goroutine should be canceled if parent ctx is canceled.
WDYT?
The key differences between apm.DetachedContext and apm.ContextWithTransaction(ctx, nil):
apm.DetachedContext(ctx)
Creates a new background context that is independent of the original context's lifecycle
Preserves APM metadata from the original context (like trace/span IDs) while detaching from cancellation signals
Ideal for goroutines that should continue after parent completes - perfect for your go i.watchPackageFileSystem() use case
The goroutine won't be cancelled when ctx is cancelled
Visual representation:apm.ContextWithTransaction(ctx, nil)
Associates a transaction with a context (passing nil means no transaction)
Maintains parent context's lifecycle - if ctx is cancelled, the returned context is also cancelled
Used to explicitly set/override transaction tracking
Passing nil essentially returns a context with no active transaction
Visual representation:For Your Use Case
Using apm.DetachedContext(ctx) is correct here because:The watcher goroutine is long-running and should survive beyond the initiating request
You want to maintain APM tracing links without inheriting cancellation
The watcher has its own lifecycle management
If you used ContextWithTransaction(ctx, nil), the watcher would be cancelled prematurely when the parent context completes.
There was a problem hiding this comment.
i've asked Copilot about it and it proposed me to use
apm.DetachedContext(ctx)instead, giving me the following explanation, but from my pov this is not what we want, as we want to remove previous trace reference and init a new one inside the go routine. Also, the goroutine should be canceled if parent ctx is canceled.
I agree with you that it should be kept apm.ContextWithTransaction(ctx, nil) for both reasons you mentioned: new transaction should be initialized in the go routine as well as the go routine should be canceled if the parent ctx is canceled.
💚 Build Succeeded
History
|
Resolves #1351
This PR adds a flags to the package-registry in order to allow empty package directory when the registry starts up.
When this is enabled by flag, the filesystem indexers are watching for changes at the given directories. The watcher uses fsnotify, so any change on the configured directories will trigger an event.
The flag also enables an env to activate this behavour.
Testing
Tested manually creating an empty folder and add it to the config file. Running the registry without the flag should fail.
Using the flag and using the environment variables both make the registry run with empty folder.
When packages are added to the directory, the registry packages list is updated and they are available at the indexer